On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields

Authors

Georg Heigold

Ralf Schlüter

Hermann Ney

Abstract

In this work we show that Gaussian HMMs (GHMMs) are equivalent to GHMM-like Hidden Conditional Random Fields (HCRFs). Hence, improvements of HCRFs over GHMMs reported in the literature are not due to refined acoustic modeling but rather come from the more robust formulation of the underlying optimization problem or from spurious local optima. Conventional GHMMs are usually estimated with a criterion at the segment level, whereas hybrid approaches are based on a formulation of the criterion at the frame level. In contrast to CRFs, these approaches do not provide scores or do not support more than two classes in a natural way. In this work we analyze these two classes of criteria and propose a refined frame-based criterion, which is shown to be an approximation of the associated criterion at the segment level. Experimental results concerning these issues are reported for the German digit string recognition task Sietill and the large vocabulary English European Parliament Plenary Sessions (EPPS) task.
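As a rough illustration of the equivalence claim, the sketch below checks numerically that class posteriors from one-dimensional Gaussians with class priors coincide with a log-linear (softmax) model over the features x^2, x, and 1. This is not the paper's proof: it covers only a single frame without HMM transitions, and only the direction from Gaussian parameters to log-linear parameters. All function names and the numeric values are assumptions made for the example.

```python
import math


def gaussian_posteriors(x, priors, means, variances):
    """Class posteriors p(c | x) via Bayes' rule with 1-D Gaussian likelihoods."""
    joint = [
        p * math.exp(-(x - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
        for p, m, v in zip(priors, means, variances)
    ]
    total = sum(joint)
    return [j / total for j in joint]


def to_log_linear(priors, means, variances):
    """Map (prior, mean, variance) per class to weights on x^2, x, and 1.

    Expanding log(prior * N(x; mean, var)) gives
      -x^2 / (2 var) + x * mean / var
      + log(prior) - mean^2 / (2 var) - 0.5 * log(2 pi var).
    """
    params = []
    for p, m, v in zip(priors, means, variances):
        lam2 = -1.0 / (2.0 * v)
        lam1 = m / v
        lam0 = math.log(p) - m * m / (2.0 * v) - 0.5 * math.log(2.0 * math.pi * v)
        params.append((lam2, lam1, lam0))
    return params


def log_linear_posteriors(x, params):
    """Softmax over per-class scores lam2 * x^2 + lam1 * x + lam0."""
    scores = [l2 * x * x + l1 * x + l0 for l2, l1, l0 in params]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical three-class example; the values are arbitrary.
    priors = [0.5, 0.3, 0.2]
    means = [-1.0, 0.5, 2.0]
    variances = [1.0, 0.5, 2.0]
    params = to_log_linear(priors, means, variances)
    for x in (-2.0, 0.0, 0.7, 3.0):
        print(x, gaussian_posteriors(x, priors, means, variances),
              log_linear_posteriors(x, params))
```

The two posterior vectors printed for each x should agree up to floating-point error, which is the sense in which the generative and the log-linear parameterizations describe the same posterior model in this simplified setting.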

Related articles

Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM ...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition

The hidden Markov model is a popular statistical method used in continuous and discrete speech recognition. The probability density function of the observation vectors in each state is estimated with discrete-density or continuous-density modeling. The performance (in correct word recognition rate) of the continuous-density HMM is higher than that of the discrete-density HMM, but its computational complexity is very ...

Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition

The use of Deep Belief Networks (DBN) to pretrain Neural Networks has recently led to a resurgence in the use of Artificial Neural Network Hidden Markov Model (ANN/HMM) hybrid systems for Automatic Speech Recognition (ASR). In this paper we report results of a DBN-pretrained context-dependent ANN/HMM system trained on two datasets that are much larger than any reported previously with DBN-pretr...

Publication date: 2007
